

1.   Summary.   Let $X_{ij}$ ($j = 1, \ldots, m_i$; $i = 1, \ldots, k$) be independent
samples from populations with continuous cumulative
distribution functions $F_i(x) = F(x - \xi_i)$. For testing the
hypothesis $H_0: \xi_1 = \cdots = \xi_k$ against $H_1: \xi_1 \le \cdots \le \xi_k$
(with at least one inequality strong), a class of test statistics
based on the ranks of the observations is proposed and their
limiting distributions are derived. These results are used to
obtain the asymptotic relative efficiencies (in the Pitman
sense) with respect to one another and to the normal theory
competitor.

2.   The Proposed Test.   Let the observable random variables
be $X_{i\alpha}$, and suppose they are of the form

(2.1)       $X_{i\alpha} = \xi_i + U_{i\alpha}$   ($\alpha = 1, \ldots, m_i$; $i = 1, \ldots, k$)

where the variables $U_{i\alpha}$ are independently distributed with
common distribution $F$ having density $f$ and the $\xi$'s are
unknown constants. Denote by $X_i$ the vector $(X_{i1}, \ldots, X_{im_i})$
and suppose that the statistic $h$ [defined below] is
calculated for every pair of samples, there being $k(k-1)/2$
pairs in all. We shall write $h_{ij}(X_i, X_j)$ for the value
obtained from the $i$th and $j$th samples ($i, j = 1, \ldots, k$; $i \ne j$). We
define


(2.2)         $h_{ij}(X_i, X_j) = \sum_{\alpha=1}^{m_i} E_\Psi\bigl[V_{ij}^{(S_\alpha)}\bigr]$

where $S_1 < \cdots < S_{m_i}$ denote the ranks of $X_{i1}, \ldots, X_{im_i}$ in
the combined $i$th and $j$th samples, and where
$V_{ij}^{(1)} < \cdots < V_{ij}^{(m_i + m_j)}$
denote an ordered sample of size $(m_i + m_j)$ from a distribution
$\Psi$. Then we propose to consider the statistic $V$ defined as

(2.3)         $V = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} U_{ij}$

for testing the null hypothesis $H_0: \xi_1 = \cdots = \xi_k$ against
$H_1: \xi_1 \le \cdots \le \xi_k$ (with at least one inequality strong).
Here

(2.4)        $U_{ij} = (m_i + m_j)(h_{ij} - \mu_{ij}(0)) / (m_i m_j)$

and $\mu_{ij}(0)$ is a normalizing constant for $h_{ij}$ specified
below (see (3.4)).
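
As a concrete illustration of (2.3) and (2.4), the following Python sketch computes $V$ for rank-sum scores, that is, with $\Psi$ taken to be the uniform distribution on (0,1), for which the expected value of the $s$th order statistic in a sample of size $n$ is $s/(n+1)$. It assumes that $h_{ij}$ is the sum of these expected order statistics at the ranks of the $i$th sample within the pooled $i$th and $j$th samples, and it uses the exact null mean $m_i/2$ of that sum in place of $\mu_{ij}(0)$. Both choices, and the function names, are assumptions made for illustration only.

```python
from itertools import combinations


def h_uniform(x_i, x_j):
    """Sum of expected uniform order statistics at the ranks of x_i
    in the pooled sample (assumed form of h_ij; no ties expected)."""
    pooled = sorted(list(x_i) + list(x_j))
    n = len(pooled)
    # Rank of each observation of the i-th sample in the pooled sample.
    ranks = [pooled.index(v) + 1 for v in x_i]
    # E[V^(s)] = s / (n + 1) for a uniform sample of size n.
    return sum(r / (n + 1) for r in ranks)


def v_statistic(samples):
    """V = sum over pairs i < j of U_ij, with U_ij as in (2.4)."""
    total = 0.0
    for i, j in combinations(range(len(samples)), 2):
        m_i, m_j = len(samples[i]), len(samples[j])
        h = h_uniform(samples[i], samples[j])
        mu0 = m_i / 2.0  # assumed centering: null mean of h for uniform scores
        total += (m_i + m_j) * (h - mu0) / (m_i * m_j)
    return total


# Hypothetical data: three small samples with no ties.
v = v_statistic([[0.4, 1.2], [1.9, 2.3], [2.8, 3.1]])
```

With this centering the pairwise terms are antisymmetric, so swapping the two samples of a pair changes only the sign of $U_{ij}$.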

3.   The Large Sample Distribution.   To obtain the large
sample distribution of the statistic $V$, we begin by determining
the asymptotic distribution of the statistics $h_{ij}$. This is
given by the following theorem, where the sample sizes $m_i$
are assumed to tend to infinity in such a way that
$m_i = \rho_i N$; $N \to \infty$ ($i = 1, \ldots, k$).

Theorem 3.1.  (i)   Suppose that the variables $X_{i\alpha}$ have the
distribution function specified in connection with (2.1) with
fixed $F$ but a sequence of means $(\xi_1, \ldots, \xi_k)$
satisfying


(3.1)         $\xi_i = a_i N^{-1/2}$   ($i = 1, \ldots, k$).


Let $h_{ij}$ be defined as in (2.2) with $\Psi$ satisfying
the assumptions of Theorem 7.1 of [6]. Then the variables
$(W_1, \ldots, W_{k-1})$ given by

(3.2)         $W_i = N^{1/2}(h_{ik} - \mu_{ik}(a)) / m_k$,   $i = 1, \ldots, k-1$


have a joint asymptotic normal distribution as $N \to \infty$, with
zero mean and covariance matrix


(3.3)         Var($W_i$) = $A^2 \rho_i / [(\rho_i + \rho_k)\rho_k]$

              Cov($W_i, W_j$) = $A^2 \rho_i \rho_j / [(\rho_i + \rho_k)(\rho_j + \rho_k)\rho_k]$


where


(3.4)         $\mu_{ik}(a) = \cdots\, dF_i(x)$

(3.5)         $A^2 = \int_0^1 J^2(x)\,dx - \left( \int_0^1 J(x)\,dx \right)^2$;   $J = \Psi^{-1}$.


Here the density $f$ of $F$ is assumed to satisfy the
regularity conditions of the Hodges-Lehmann lemma 7.2 of [6].

(ii)   For any $i$ and $j$,

(3.6)         $N^{1/2} U_{ij} \sim N^{1/2}(U_{ik} - U_{jk})$

where $\sim$ indicates that the difference of the two sides tends
to zero in probability.

The proof of (i) is given in [4].
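
The constant $A^2$ of (3.5) can be checked numerically for the two score functions that appear in the special cases of Section 4. The sketch below assumes $J$ is the inverse of $\Psi$ on (0,1) and integrates by the midpoint rule; the helper name and the grid size are illustrative choices. For uniform scores $J(u) = u$ gives $A^2 = 1/3 - 1/4 = 1/12$, and for normal scores $J$ is the standard normal quantile function, for which $A^2 = 1$.

```python
from statistics import NormalDist


def score_variance(J, n=200_000):
    """Midpoint-rule approximation of
    A^2 = int_0^1 J(u)^2 du - (int_0^1 J(u) du)^2."""
    h = 1.0 / n
    first = 0.0   # accumulates int J(u) du
    second = 0.0  # accumulates int J(u)^2 du
    for r in range(n):
        value = J((r + 0.5) * h)  # midpoints stay strictly inside (0, 1)
        first += value * h
        second += value * value * h
    return second - first ** 2


a2_uniform = score_variance(lambda u: u)             # close to 1/12
a2_normal = score_variance(NormalDist().inv_cdf)     # close to 1
```

The normal-score integrand is unbounded near 0 and 1, so the midpoint rule converges slowly there; the result is accurate to a few decimal places only.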

Now denoting $(m_i + m_k) W_i / m_i$ by $V_i$, it follows
immediately that the random vector $(V_1, \ldots, V_{k-1})$ has a
limiting normal distribution with zero mean vector and covariance
matrix

(3.7)         Var($V_i$) = $(1/\rho_i + 1/\rho_k) A^2$

              Cov($V_i, V_j$) = $A^2 / \rho_k$
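
The step to (3.7) is a rescaling and can be written out as follows. The sketch assumes that the entries of (3.3) are Var($W_i$) = $A^2\rho_i/[(\rho_i+\rho_k)\rho_k]$ and Cov($W_i, W_j$) = $A^2\rho_i\rho_j/[(\rho_i+\rho_k)(\rho_j+\rho_k)\rho_k]$, and uses $m_i = \rho_i N$, so that $V_i = (\rho_i + \rho_k) W_i / \rho_i$.

```latex
\begin{align*}
\operatorname{Var}(V_i)
  &= \Bigl(\frac{\rho_i+\rho_k}{\rho_i}\Bigr)^{2}
     \frac{A^2\rho_i}{(\rho_i+\rho_k)\rho_k}
   = A^2\,\frac{\rho_i+\rho_k}{\rho_i\rho_k}
   = \Bigl(\frac{1}{\rho_i}+\frac{1}{\rho_k}\Bigr)A^2, \\
\operatorname{Cov}(V_i,V_j)
  &= \frac{(\rho_i+\rho_k)(\rho_j+\rho_k)}{\rho_i\rho_j}\cdot
     \frac{A^2\rho_i\rho_j}{(\rho_i+\rho_k)(\rho_j+\rho_k)\rho_k}
   = \frac{A^2}{\rho_k}.
\end{align*}
```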

Furthermore, since under the assumed regularity conditions

(3.8)   $N^{1/2}(m_i + m_k)[\mu_{ik}(a) - \mu_{ik}(0)] / (m_i m_k) \to (a_i - a_k) \int \frac{dJ[F(x)]}{dx}\, dF(x)$   as $N \to \infty$

(see for example lemma 7.2 of [6]) it follows that the random
vector $N^{1/2}(U_{1k}, \ldots, U_{k-1,k})$ has a limiting normal distribution
with mean vector $(\eta_1, \ldots, \eta_{k-1})$ where
$\eta_i = (a_i - a_k) \int \frac{dJ[F(x)]}{dx}\, dF(x)$ and covariance matrix
given by (3.7).
Part (ii) of Theorem 3.1 is a consequence of the following
lemma, the proof of which is given in [4].

Lemma 3.1.   Let $(A_1^{(N)}, A_2^{(N)}, A_3^{(N)})$ be a sequence of random
vectors such that for any $1 \le i < j \le 3$, the pair $(A_i^{(N)}, A_j^{(N)})$
converges in law to a bivariate distribution with mean
$(\eta_i, \eta_j)$ and covariance matrix
$\begin{pmatrix} \sigma_{ii} & \sigma_{ij} \\ \sigma_{ij} & \sigma_{jj} \end{pmatrix}$.
If the $\sigma$'s satisfy

(3.9)         $\sum_{i} \sigma_{ii} + 2 \sum\sum_{i<j} \sigma_{ij} = 0$

then $\sum_i A_i^{(N)} \to \sum_i \eta_i$ in probability.
Proof of Theorem 3.1 (ii).

Let   $A_1^{(N)} = N^{1/2} U_{ij}$,

      $A_2^{(N)} = N^{1/2} U_{jk}$,

      $A_3^{(N)} = N^{1/2} U_{ki}$.


Using the fact that $U_{ij}$ = constant $- U_{ji}$, it follows
from the above discussion that the assumptions of lemma 3.1
are satisfied with


$\sigma_{11} = (1/\rho_i + 1/\rho_j) A^2$,   $\sigma_{22} = (1/\rho_j + 1/\rho_k) A^2$,


$\sigma_{13} = -A^2/\rho_i$,   $\sigma_{23} = -A^2/\rho_k$


so that (3.9) holds. The proof follows.
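
Condition (3.9) can be verified directly. The check below assumes, by the symmetry of the three pairs, that the two entries not listed above are $\sigma_{33} = (1/\rho_k + 1/\rho_i)A^2$ and $\sigma_{12} = -A^2/\rho_j$; these are inferred here rather than quoted.

```latex
\begin{align*}
\sigma_{11}+\sigma_{22}+\sigma_{33}
  &= 2A^2\Bigl(\frac{1}{\rho_i}+\frac{1}{\rho_j}+\frac{1}{\rho_k}\Bigr), \\
2(\sigma_{12}+\sigma_{13}+\sigma_{23})
  &= -2A^2\Bigl(\frac{1}{\rho_j}+\frac{1}{\rho_i}+\frac{1}{\rho_k}\Bigr), \\
\sum_{i}\sigma_{ii} + 2\sum\sum_{i<j}\sigma_{ij} &= 0 .
\end{align*}
```

The limiting means also sum to zero, since $(a_i - a_j) + (a_j - a_k) + (a_k - a_i) = 0$, so the lemma gives $N^{1/2}(U_{ij} + U_{jk} + U_{ki}) \to 0$ in probability, which is the relation used for (3.6).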

We have now proved that the joint distribution of the
set of random variables $\{N^{1/2} U_{ij}\}$ is asymptotically normal
with means $(a_i - a_j) \int \frac{dJ[F(x)]}{dx}\, dF(x)$ and covariance
matrix

(3.10)        Var($N^{1/2} U_{ij}$) = $(1/\rho_i + 1/\rho_j) A^2$
It now follows that $N^{1/2} V$ has a limiting normal
distribution with mean $\sum\sum_{i<j} (a_i - a_j) \int \frac{dJ[F(x)]}{dx}\, dF(x)$

and variance (after omitting straightforward computations)
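
The omitted computation can be sketched as follows, assuming (3.6) and the covariance structure (3.7). The coefficients $c_i$ and the convention $U_{kk} = 0$ are notation introduced here for the sketch, not taken from the text.

```latex
\begin{align*}
N^{1/2}V
  &\sim N^{1/2}\sum_{i<j}\,(U_{ik}-U_{jk})
   = \sum_{i=1}^{k-1} c_i\, N^{1/2}U_{ik},
   \qquad c_i = k-2i+1, \\
\operatorname{Var}
  &\to A^2\Bigl[\sum_{i=1}^{k-1}\frac{c_i^{2}}{\rho_i}
      + \frac{1}{\rho_k}\Bigl(\sum_{i=1}^{k-1}c_i\Bigr)^{2}\Bigr]
   = A^2\sum_{i=1}^{k}\frac{(k-2i+1)^{2}}{\rho_i},
\end{align*}
```

where the last step uses $\sum_{i=1}^{k} c_i = 0$, so that $\sum_{i=1}^{k-1} c_i = -c_k = k-1$.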
4.   Asymptotic Relative Efficiency.   We are now in a position
to make large sample comparisons between different members of
the V test and their normal theory competitor based on
Student's statistic. We shall adopt a method due to Pitman,
who defined the relative asymptotic efficiency of two sequences
of tests as the limiting inverse ratio of sample sizes necessary
to achieve the same power against the same sequence of
alternatives at the same significance level.

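Theorem 4.1.   The asymptotic efficiency of the V test
relative to the normal theory test based on the statistic

$T = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} (\bar{X}_i - \bar{X}_j)$,   $\bar{X}_i = \frac{1}{m_i} \sum_{\alpha=1}^{m_i} X_{i\alpha}$,

is

(4.1)         $e_{V,T}(F) = \sigma^2 \left( \int \frac{dJ[F(x)]}{dx}\, dF(x) \right)^2 / A^2$

where $\sigma^2 = $ Var($X_{i\alpha}$).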


Proof:   Let $T_{ij} = \bar{X}_i - \bar{X}_j$ and $V_i' = N^{1/2}[T_{ik} - (\xi_i - \xi_k)]$.


Then the variables $(V_1', \ldots, V_{k-1}')$ have an asymptotic normal
distribution with zero mean and covariance matrix

Var($V_i'$) = $\sigma^2 (1/\rho_i + 1/\rho_k)$
Furthermore, since $T_{ij} = T_{ik} - T_{jk}$, it follows that the
$k(k-1)/2$ random variables $\{N^{1/2}[T_{ij} - (\xi_i - \xi_j)],\ i < j\}$
have an asymptotic normal distribution with zero mean and
variances

Var($N^{1/2} T_{ij}$) = $\sigma^2 (1/\rho_i + 1/\rho_j)$.

Hence $N^{1/2} T$ has a limiting normal distribution with mean
$\sum\sum_{i<j} (a_i - a_j)$ and variance $\sigma^2 \sum_{i=1}^{k} (k - 2i + 1)^2 / \rho_i$. Now

proceeding  by  the  standard  arguments,  see  for  example  [3]  or 
PJ ,  the  proof  follows. 

The relative efficiency $e_{V,T}$ of the $\Psi$-scores procedure
$V$ to the means procedure $T$ is the same as that found by
Chernoff and Savage [1] for the corresponding procedures in the
two sample problem and shown by the author [6] to be valid also
for the k-sample problem in the usual set up [cf. Section 5].

Special  Cases. 

(a)  Let $\Psi$ be the rectangular distribution $R$ on
(0,1). Then the V test reduces to the rank sum test V(R).
The efficiency (4.1) then is equal to $12\sigma^2 \left( \int f^2(x)\,dx \right)^2$.
This is known to satisfy $e_{V(R),T}(F) \ge 0.864$ for all $F$;
$e_{V(R),T}(F) = 3/\pi \approx 0.955$ when $F$ is normal; and $e_{V(R),T}(F) > 1$
for many non-normal distributions [cf. Hodges and Lehmann
[2]].

(b)  Let $\Psi$ be the standard normal distribution $\Phi$.
Then the V test reduces to the normal scores V($\Phi$) test.
The efficiency (4.1) then is known to satisfy $e_{V(\Phi),T}(F) \ge 1$
for all $F$ and $e_{V(\Phi),T}(F) = 1$ if and only if $F$ is normal.

The high value for the minimum asymptotic efficiency
of the V(R)-test relative to the T-test, which was first obtained
by Hodges and Lehmann [2], is very reassuring. In practical
terms it means that in large samples we cannot lose more than
13.6 percent in using the V(R)-test rather than the T-test
for the shift parameters. On the other hand we may gain
a great deal since, as stated above, $e_{V(R),T}(F) > 1$ for many
non-normal distributions. (For the Gamma distribution $G$ with
parameter $p = 1$, $e_{V(R),T}(G) = 3$.) If the distribution is
actually normal, the loss of efficiency is about 5 percent.
The implication of the result in (b) is far reaching. The
V($\Phi$) test is completely robust and has the minimum relative
efficiency of 1 compared to the T-test. It is therefore
difficult to make a case for the customary use of the normal
theory tests when the sample sizes are reasonably large. (For
an efficiency comparison of the rank sum to the normal scores
procedure the reader is referred to an interesting paper by
Hodges and Lehmann [3].)
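
The numerical values quoted for the rank sum test can be reproduced from the expression $12\sigma^2 (\int f^2(x)\,dx)^2$ of special case (a). The sketch below evaluates it by the midpoint rule for the standard normal density and for the Gamma density with $p = 1$, taken here to be the exponential density $e^{-x}$ on $x > 0$ (an assumption about the parameterization). The function names, integration limits and grid size are illustrative choices.

```python
import math


def rank_sum_efficiency(pdf, variance, lo, hi, n=200_000):
    """12 * sigma^2 * (integral of f(x)^2 dx)^2, integral by midpoint rule."""
    h = (hi - lo) / n
    integral = sum(pdf(lo + (r + 0.5) * h) ** 2 for r in range(n)) * h
    return 12.0 * variance * integral ** 2


def normal_pdf(x):
    """Standard normal density (variance 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def exponential_pdf(x):
    """Exponential density with unit rate (variance 1), for x >= 0."""
    return math.exp(-x)


# The integration ranges truncate tails whose contribution is negligible.
e_normal = rank_sum_efficiency(normal_pdf, 1.0, -10.0, 10.0)
e_exponential = rank_sum_efficiency(exponential_pdf, 1.0, 0.0, 40.0)
```

Here `e_normal` agrees with $3/\pi$ and `e_exponential` with 3 up to the integration error. The expression is unchanged by a change of scale, so unit variance is used without loss of generality.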

5.   A Few Comments.   The V test presented in this paper is
another extension by the author to several groups of a class of
procedures which have been frequently recommended for the usual
two-sample problems. In [6], the author proposed a class
of k-sample tests which included as special cases the rank sum
$\mathcal{L}(R)$ test, better known as the Kruskal-Wallis H-test, and the
normal scores $\mathcal{L}(\Phi)$ test. The alternatives considered in [6]
were $H: \xi_1 \ne \xi_2 \ne \cdots \ne \xi_k$ with at least one inequality,
as opposed to the ordered alternatives considered in the present
paper. Both classes of tests coincide for the case $k = 2$,
but whereas the V-test can be regarded as a generalization
of the single-tail tests, the $\mathcal{L}$ test cannot; for in the latter
case, the distinction between one- and two-tail tests is lost when
$k > 2$. It was shown in [6] that (4.1) is also the asymptotic
efficiency of the $\mathcal{L}$ test relative to the classical $F$ test. It
would be interesting to compare (i) the V(R) and $\mathcal{L}(R)$ tests,
(ii) the V($\Phi$) and $\mathcal{L}(\Phi)$ tests, (iii) the V(R) and $\mathcal{L}(\Phi)$ tests,
and (iv) the V($\Phi$) and $\mathcal{L}(R)$ tests. The problem appears to be a bit
complicated because the forms of their limiting
distributions are different. For fixed $k$, $\mathcal{L}$ has a limiting
non-central chi-square distribution and $V$ a limiting normal
distribution. Since the corresponding tests in (i) and (ii)
coincide when $k = 2$, it is conjectured that they will not
differ greatly when $k > 2$. In [7], where the present problem
is considered in greater generality, the author has reviewed
a number of existing tests and has made asymptotic power
comparisons in an attempt to select the best test. The results
will appear in a subsequent paper.

6.   Acknowledgement.   It is a pleasure to express appreciation
to Professor Allan Birnbaum for some helpful comments.